Long-Term Outcome of Graves' Disease: A Gender Perspective

Introduction: In gender-skewed conditions such as Graves' disease (GD), the outcome naturally becomes dominated by the majority. This may lead to gender-biased misunderstandings regarding treatment outcomes. This especially holds true when complications, such as depression, are unevenly distributed. We have, therefore, studied the long-term outcome of GD from a gender perspective. Materials and Methods: A cohort of 1186 patients with GD was included in a follow-up 6–10 years after inclusion. Choice of treatment, the feeling of recovery, long-term treatment, comorbidity, and quality of life were investigated with questionnaires. All results were studied sex-divided. Results: We included 973 women and 213 men. There was no difference between men and women in the choice of treatment. At follow-up, women scored significantly worse in the general questionnaire 36-item Short-Form Health Status (SF-36) domain bodily pain and in the thyroid-specific Thyroid-Related Patient-Reported Outcome (ThyPRO) domains depression, impaired sex life, and cosmetic complaints, all p < 0.05. Women were twice as likely (29.5%) to be treated with levothyroxine after successful treatment with antithyroid drugs (ATD) compared with men (14.9%, p < 0.05). Conclusion: After treatment for GD, women were more affected by depression, impaired sex life, cosmetic issues, and bodily pain despite successful cure of hyperthyroidism. The prevalence of hypothyroidism was also doubled in women. Whether these observed gender differences reflect a worse outcome of GD in women or a natural consequence of a higher prevalence of these symptoms and autoimmunity in the female population is difficult to disentangle. Nevertheless, several years after GD, women reveal more persistent symptoms.

Keywords: quality of life; long-term follow-up; Graves disease

Introduction
Graves' disease (GD), like most autoimmune conditions, affects women more often than men (4:1). 1 This gender difference in prevalence also exists for some of the long-term complications of GD. 2 An example is depression, which is described as both a complication of GD and a gender-skewed condition in the general population with higher prevalence in females. 3,4 When a condition, such as GD, and an outcome, such as depression, are unequally distributed among men and women, the investigated outcomes inevitably become dominated by the majority's outcome.
A consequence of these basic facts is that the individual doctor in the clinic is much more likely to meet a female than a male GD patient and that this group of females will more commonly be depressed compared with the men. But does that mean that depression as a long-term complication of GD is more common in women than in men? If so, it is important to be aware of early signs; if not, knowing this is important to avoid the preconceptions that sometimes arise in gender-skewed conditions.
Treatment of GD is usually performed with antithyroid drugs (ATD), offering a possibility of cure after long-term treatment, or with permanent ablation of the thyroid by radioactive iodine (RAI) or total thyroidectomy. 5 Following successful ablative therapy, patients require levothyroxine substitution. 6 In Sweden, the first choice of treatment is dominated by ATD. 7 Recurrence after ATD treatment is common, occurring in 50%-55% depending on follow-up. 7 Many patients change to ablative treatments during the course of the disease. 7 [9][10][11]
In untreated GD, classical symptoms are tachycardia, tremor, sweating, weight loss, and mental symptoms. With treatment, symptoms usually become less prominent, but recovery can be incomplete. [14][15] Investigations on the effect of choice of treatment modality on follow-up quality of life (QoL) have yielded conflicting results: no difference in some investigations 16,17 and worse outcomes in RAI-treated patients compared with ATD or surgically treated patients in one study. 15 If there are differences between men and women in the initially chosen treatment, or in QoL many years after diagnosis, they are not well investigated.
A few gender differences have, however, been noted: men have been reported to have a greater likelihood of relapse after ATD treatment. 18,19 Women have a higher incidence of the complication thyroid-associated ophthalmopathy (TAO), although the most severe forms are dominated by men. 20 To elucidate further gender differences, we investigated the TT-12 cohort. This consists of patients with a first-ever diagnosis of GD registered in Southern Sweden 2003-2005. 21 A follow-up study was performed in 2011-2013, and the QoL results of the whole cohort (1186 patients) have previously been published. 15
This investigation aimed at examining whether treatment and long-term outcome in GD differ between the sexes. We hypothesized that the long-term consequences of GD would be worse in women compared with men due to the higher prevalence of depression and anxiety in the female general population. We also hypothesized that hypothyroidism as a complication would be more prevalent in females due to higher general autoimmunity.

Study design
A detailed description of the study design has previously been published. 7 In brief, the study was a pragmatic trial, investigating the effectiveness of ATD, RAI, and surgical therapy in standard practices throughout Sweden. In 2003-2005, all patients with newly diagnosed GD or toxic nodular goiter (TNG; n = 2916) were registered in 13 endocrine clinics that covered a referral population corresponding to 40% of the Swedish population. 21 All GD patients had biochemically verified disease with suppressed TSH, and elevated free T4 and TSH-receptor antibody (TRAb). In patients with negative TRAb, disease was verified with a radio uptake scan.
All GD patients were contacted for a longitudinal evaluation of outcome between 6 and 10 years after the original diagnosis. From the original cohort of 2916 hyperthyroid patients, the following groups were excluded: patients with TNG (n = 640), patients with an uncertain diagnosis (n = 8), children (n = 64), deceased patients (n = 137), emigrated patients (n = 24), and patients lost to follow-up (n = 67). Altogether, 1976 men and women with a history of de novo GD were invited to participate in a follow-up study.
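
The cohort flow described above can be checked with a few lines of arithmetic. The following is a minimal sketch, not part of the original analysis; all counts are taken from the text, and the variable names are illustrative assumptions.

```python
# Cohort flow from registered hyperthyroid patients to invited GD patients.
# Counts come from the text; names are illustrative only.
registered = 2916

exclusions = {
    "toxic nodular goiter": 640,
    "uncertain diagnosis": 8,
    "children": 64,
    "deceased": 137,
    "emigrated": 24,
    "lost to follow-up": 67,
}

# Everyone not excluded was invited to the follow-up study.
invited = registered - sum(exclusions.values())

for reason, n in exclusions.items():
    print(f"excluded ({reason}): {n}")
print(f"invited to follow-up: {invited}")
```

The subtraction reproduces the 1976 invited patients stated in the text.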

Subjects
A total of 1186 (60%) of the GD patients agreed to participate, whereas 194 (10%) women and 51 (3%) men declined and 428 (22%) women and 117 (6%) men did not respond. All were sent a clinical questionnaire comprising 68 questions together with 2 QoL questionnaires described next. In addition, the medical history records of the 1186 included participants were reviewed concerning their treatment modality, the order of treatments, and the recurrence of disease after each treatment course. The medical history record was also reviewed if verification of issues such as diagnosis, treatment, or occurrence of TAO was needed.
Participants were compared with healthy controls without a history of thyroid disease from one Swedish 22 and one Danish 23 sample.
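
As a consistency check on the response figures reported in this section, the sketch below recomputes each percentage against the 1976 invited patients. It is an illustration only; the helper `pct` and the dictionary names are assumptions introduced here, not part of the study.

```python
# Response breakdown among the 1976 invited GD patients (counts from the text).
invited = 1976
participated = 1186
declined = {"women": 194, "men": 51}
no_response = {"women": 428, "men": 117}


def pct(n, total=invited):
    """Percentage of the invited cohort, rounded to a whole number."""
    return round(100 * n / total)


# Participants, decliners, and non-responders should add up to all invited.
accounted_for = participated + sum(declined.values()) + sum(no_response.values())

print(f"participated: {participated} ({pct(participated)}%)")
for sex, n in declined.items():
    print(f"declined, {sex}: {n} ({pct(n)}%)")
for sex, n in no_response.items():
    print(f"no response, {sex}: {n} ({pct(n)}%)")
print(f"accounted for: {accounted_for} of {invited}")
```

The groups sum to the full invited cohort, and the rounded percentages match those given in the text.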

Treatment strategies for GD in Sweden 2003-2005
During this time period in Sweden, ATD was most frequently administered as block-and-replacement treatment for 12-18 months. This approach was used to promote long-term remission in GD patients without the need for additional ATD or levothyroxine replacement.
Depending on disease severity and local practice, most patients (98%) were prescribed 10-15 mg methimazole twice daily or 100 mg propylthiouracil thrice daily with the addition of levothyroxine after 2-4 weeks. After 2-6 months of ATD treatment, the long-term strategy was discussed with the patient, and a decision was made to continue ATD or to choose an ablative treatment.
The reasons for ablative thyroid therapy included pregnancy wishes, adverse effects of ATD, presence of goiter, other compromising medical conditions exposing the patient to a higher medical risk, the persistence of high TRAbs stimulating the thyroid hormone synthesis, as well as the patient's preference.
ATD treatment was generally given for 12-18 months, and the decision to discontinue was based on a normalized or near-normalized TRAb level. In case of relapse, and if a second period of ATD therapy was chosen, it was continued for a minimum of 12 months followed by a repeated measurement of TRAb to determine whether the medication could be stopped. Patients were generally informed of the risk of future recurrence and recommended ablative treatment.
RAI therapy was administered as one dose of I-131 with the aim of achieving hypothyroidism. The I-131 activity was calculated individually in 464 out of 505 patients (91.9%) using the thyroid volume and the percentage of I-131 uptake at 24 hours and after 5-7 days (T1/2) to achieve an absorbed dose of 120 Gray to the thyroid gland. In 41 out of 505 patients (8.1%), RAI therapy was given as a fixed dose.
Surgical treatment was total thyroidectomy in 265 out of 278 patients (95%); the remaining 13 (5%) underwent subtotal thyroidectomy.

Definition of parameters used for evaluation
First-line treatment was defined as the initially chosen mode of treatment.
A treatment period was defined as the period from the start of treatment until remission was expected. For ATD, this period was between 12 and 18 months. Remission was defined as persistent euthyroidism 3-6 months after ATD and levothyroxine discontinuation. Patients who discontinued ATD were generally followed for 6 months. If the patients did not reappear in the files and did not mention relapse in their questionnaire, we defined them as in remission.

Questionnaires
For the study, we developed a 68-item clinical patient questionnaire described in the previous publication. 7 The questionnaire comprised questions regarding demographic data, comorbidity, treatment modality, and a question as to whether the patient felt recovered from the thyroid disease.
General QoL was assessed with the 36-item Short-Form Health Status (SF-36), 24 and thyroid-specific QoL with the Thyroid-Related Patient-Reported Outcome (ThyPRO). 25,26 Both were validated in the included populations. 25,27

Data management and validation
Answers to questionnaires were auto-scanned and transferred into a database by using scanning software (Remark Office OMR 8; Remark, Malvern, PA). The auto-scanning was validated and corrected manually. The database was validated and corrected by cross-checking key points against the medical history files of a randomly selected 5% of patients.

Statistics
The demographic data were analyzed using Pearson's chi-square or Fisher exact test for categorical variables and an unpaired t-test for continuous variables. Statistical analyses were performed using IBM SPSS Statistics 27.0.1.0 64-bit edition (SPSS Institute, Chicago, IL). Statistical significance was set at p < 0.05. Differences in QoL mean scale scores were evaluated with multiple linear regression and were adjusted for age and comorbidity (yes or no). Correction for multiple testing was done with the Benjamini-Hochberg procedure.
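
For readers unfamiliar with the Benjamini-Hochberg procedure, the following is a minimal sketch of its step-up rule in plain Python. It is not the SPSS routine used in the study; the function name `benjamini_hochberg` and the example p-values are hypothetical and chosen only for illustration.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans, True where the hypothesis is rejected
    while controlling the false discovery rate at level alpha."""
    m = len(p_values)
    # Indices of the p-values ordered from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])

    # Step-up rule: find the largest rank k (1-based) whose ordered
    # p-value satisfies p_(k) <= (k / m) * alpha.
    largest_k = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            largest_k = rank

    # Reject every hypothesis ranked at or below largest_k.
    rejected = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= largest_k:
            rejected[idx] = True
    return rejected


# Hypothetical p-values, not taken from the study.
example_p = [0.001, 0.008, 0.039, 0.041, 0.30]
print(benjamini_hochberg(example_p))
```

With these hypothetical inputs, the two smallest p-values fall below their rank-specific thresholds (0.01 and 0.02) and are rejected, whereas 0.039 and 0.041 are not, although both are below the unadjusted 0.05 level.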

Ethics
The study was approved by the Regional ethical committee in Uppsala (Dnr 2012/035, April 4, 2012), and all appropriate processes have been followed. Informed consent by participants was obtained by mail. The study was performed according to the Declaration of Helsinki.

Results
Results for the whole group have been published previously. 7 In this publication, we compare these results between the included men and women.
The mean follow-up time was 8 years (range 6-10), and 973 women and 213 men were included, resulting in a female-to-male ratio of 4.6:1. The mean age at diagnosis was 49 years (±14 standard deviation [SD]) in men and 47 years (±14 SD) in women. There was no significant difference in the proportion of men and women regarding smoking status, country of origin, the prevalence of TAO, or comorbidities (Table 1).
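
The sex distribution reported above follows directly from the two counts. The short sketch below, with illustrative variable names, recomputes the cohort size and the female-to-male ratio.

```python
# Sex distribution of the included participants (counts from the text).
women, men = 973, 213

total = women + men
ratio = women / men                  # female-to-male ratio
female_share = 100 * women / total   # percentage of women in the cohort

print(f"total participants: {total}")
print(f"female-to-male ratio: {ratio:.1f}:1")
print(f"share of women: {female_share:.1f}%")
```

The total equals the 1186 participants, and the ratio rounds to the reported 4.6:1.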

Long-term results of first-line treatment
There were no significant differences between men and women in the chosen first-line treatment, the proportion in remission after first-line treatment, the proportion who changed to ablative treatment, or the proportion who were in remission after changing treatment (Table 2).

Quality of life
In SF-36, women scored significantly worse than men in the domain bodily pain. When compared with controls of the same sex from the Swedish general population (median age 60), female GD patients scored worse in the SF-36 domains bodily pain, vitality, and mental health, whereas male patients scored worse in the domains vitality and mental health.
In ThyPRO, female GD patients scored significantly worse in the domains depression, impaired sex life, and cosmetic complaints compared with male patients. Both male and female patients scored worse in all ThyPRO domains except tiredness compared with a healthy Danish population (median age 50) (Table 3). In the remaining domains in both questionnaires, there were no significant differences.

Levothyroxine treatment
After ablative treatment, there was no difference between men and women in the proportion who required levothyroxine supplementation, but after successful treatment with ATD only, women were twice as likely (29.5%) to be treated with levothyroxine compared with men previously treated for GD (14.9%), p < 0.05 (Table 4).

Feeling of recovery
At follow-up, both men and women who in the patient questionnaire reported that they did not feel recovered were more likely to be on levothyroxine substitution (33.2%) compared with those who felt recovered (13.9%). In women who had ablative treatment at any time, those reporting not feeling recovered were more likely to be on levothyroxine substitution (33.3%) compared with those who felt recovered (17.4%), p < 0.05. This difference was not seen in men. There was a trend toward a higher proportion of patients only treated with ATD reporting not feeling recovered (42.2%) compared with those who had ablative treatment at any time (32.0%), p = 0.065 (Table 5).

Discussion
In this long-term follow-up, we found higher scores for depression, bodily pain, complaints about sex life, and cosmetic features in women compared with men. To our knowledge, sex-divided long-term outcome of treatment of GD has not previously been reported.
Among patients cured by ATD therapy, women were also twice as likely to subsequently be treated with levothyroxine compared with men. The initial treatment choice did not differ between men and women, in agreement with previously published index studies. 28,29 Nor was there any sex difference in relapse rate after first-line treatment, in contrast to previous studies. 18,19
In the natural course of GD, hypothyroidism may develop, to varying degrees depending on follow-up. 7-11 [31][32][33] The presence of blocking TRAbs 34 may theoretically influence the prevalence of hypothyroidism post-ATD treatment, although gender differences in TRAb blocking activity remain to be elucidated. In our study, unfortunately, neither levels of TPOAb nor blocking TRAbs were known.
Women are also known to seek primary medical care more often, 35,36 and the physician may in such instances prescribe levothyroxine if thyroid failure is verified. Palpable goiter is more common in women than in men in the general population. 37 Its association with hypothyroidism is unclear, but a palpable goiter in a general practitioner's office will most often result in thyroid hormones being measured.
According to a 2021 research letter, 91.5% of levothyroxine treatment in the United States between 2008 and 2018 was initiated in patients with subclinical hypothyroidism (61%) or with normal thyroid levels (30.5%). 38 In Denmark, 18% of patients prescribed levothyroxine were initiated with TSH <5 mIU/L. 39 This indicates a rather low threshold for when to prescribe substitution.
Support for a lowered threshold over time also comes from an investigation in the United Kingdom showing that the median TSH level at the initiation of levothyroxine therapy fell from 8.7 to 7.9 mIU/L between 2001 and 2009, further indicating a change in attitude in prescribing doctors. 40 The situation in Sweden has not been investigated, but the Danish, US, and UK studies raise the question as to whether the higher levothyroxine prescription in women is more associated with a higher prevalence of symptoms typical of hypothyroidism and/or a higher consumption of primary medical care. 35,36
In the patient group successfully treated with only ATD, the proportion who did not feel recovered was significantly higher in both men and women on levothyroxine treatment compared with patients without substitution. The reason for this may be multifactorial: (1) a situation with life-long medication influences the patient's feeling of recovery, (2) patients who do not feel recovered may be investigated more often and thus the likelihood of finding a slightly elevated TSH increases, (3) doctors may be more prone to prescribe levothyroxine to patients who do not feel well and this might be gender-divided, (4) developing hypothyroidism after GD may be a sign of increased autoimmunity that may affect well-being independently of thyroid hormone levels, and (5) levothyroxine treatment is not optimal to restore well-being in all hypothyroid patients.
The findings that women are more affected by bodily pain, depression, impaired sex life, and cosmetic complaints may all be influenced by the fact that a large proportion of the female participants likely were postmenopausal at the time of follow-up. The possible influence of menopause on the findings will be discussed briefly under each separate finding.
The finding that women are more affected by bodily pain as a long-term consequence of GD should be interpreted with caution. An increased prevalence of pain in GD-treated patients has been reported in a few previous studies. 14,41,42 Although considerably different in design, none of these reported men and women separately.
However, the prevalence of chronic pain in the general population is generally considered to be higher in women compared with men. 27,43 Post-menopausal women are also reported to have greater pain symptoms compared with pre-menopausal women. 44
From the results of this study, it is not possible to state whether depression is more common as a long-term consequence of GD in women compared with men. Two previous studies in a Swedish context have concluded that the lifetime prevalence of depression in Sweden is almost twice as high in women as in men. 45,46 Thus, a higher prevalence of depression in women should be expected, whether the person is a GD patient or has no thyroid disease. Menopause has also been reported to increase the prevalence of depression. 47
The questions about impaired sex life in ThyPRO contain both a question about negative influence and one about decreased sexual desire. In the more general question of sex-life satisfaction, there is no previously known difference between men and women, 48 but lack of sexual desire is known to be more prevalent in women. 49 The effect of hyperthyroidism on men's and women's sex life has been summarized in a review, 50 but the included investigations are generally too small to allow for reliable conclusions. In summary, our findings on impaired sex life should be interpreted with caution and are likely affected by the previously described lower sexual desire in women in general or by the effect of menopause on sexuality. 51
Complaints of physical appearance, which may be related to goiter, ophthalmopathy, or weight gain after treatment, 52 were more common among women compared with men. Although no gender difference in the prevalence of TAO was found in our study, both goiter and ophthalmopathy have previously been reported to be more common in women. 20,37 Male sex has been reported to be a risk factor for excess weight gain after treatment for hyperthyroidism, 53 but most studies have found no difference in weight gain between men and women. 54 Another explanation can be a gendered cultural impact where the importance of physical appearance and body dissatisfaction, in general, is higher in women compared with men. 55
In the comparison with population norms, there was a discrepancy between the general instrument SF-36 and the thyroid-specific ThyPRO, where the latter appears to be more sensitive to discovering remaining symptoms in GD. The presence of a significant difference in both sexes in the SF-36 domain mental health highlights the long-term mental effects of GD in general.
In ThyPRO, all but one domain were significantly different from population norms, in agreement with previously described remaining symptoms long after treatment. The only domain that did not differ significantly was tiredness, a symptom frequently described in the clinic. The Swedish sample used for population norms in SF-36 was generally older than the included participants, and therefore the results are most likely an underestimation.

Strengths and limitations
Strengths of this investigation are that all treatment modalities have been used, the cohort is large, and the follow-up is considerably long.
Limitations are that data on the exact duration of the ATD treatments are lacking and that we do not have access to the degree of goiters, weight, menopausal status, or laboratory results during follow-up. Also, we do not have data on whether some patients treated with thyroid hormone substitution could be over- or undertreated, which could influence QoL.
We could also not assess whether patients were treated with levothyroxine as monotherapy or in combination with triiodothyronine, although monotherapy dominates in Sweden. The response rate in the initial cohort is also limited to 60%, a level often found in this kind of investigation, and is an inherent limitation.
One weakness in this investigation is the lack of biochemical verification with TSH and free T4 to discern at which threshold levothyroxine was prescribed, and whether there was a gender difference in these levels.

Conclusion
In this long-term follow-up in a large cohort of patients with GD, we found that women were twice as likely to be treated with levothyroxine substitution compared with men although cured from hyperthyroidism by ATD only. Women were in the long term more affected by depression, impaired sex life, cosmetic issues, and bodily pain despite successful cure of hyperthyroidism.
Whether these observed gender differences reflect a worse outcome of GD in women or a natural consequence of a higher prevalence of these symptoms and autoimmunity in the female population is difficult to disentangle. Nevertheless, several years after GD, women reveal more persistent symptoms.

Table 2. Long-Term Result of First-Line Treatment with Antithyroid Drugs, Radioiodine, Surgery, or Conservative (i.e., Observation Possibly Combined with Beta-Blocker Therapy)
a At the follow-up after the first-line treatment only. b Due to adverse reactions, development or worsening of TAO, persistent or high levels of TRAb, patient wishes, or desired pregnancy. c At follow-up after changed treatment during the first treatment period. d Total number of patients in remission after first-line treatment inclusive of patients who underwent change of first-line treatment.
ATD, antithyroid drugs; GD, Graves' disease; RAI, radioactive iodine; TRAb, TSH-receptor antibody.
Calissendorff, et al.; Women's Health Reports 2023, 4.1 http://online.liebertpub.com/doi/10.1089/whr.2023.0073

Table 3. Self-Reported Quality of Life in 1186 Patients 6-10 Years After First Episode of Graves' Disease, compared with GD patients of the opposite sex and with controls of the same sex from the general population
All results are adjusted for age and comorbidity. p-Values <0.05 after adjustment and Hochberg correction are marked in bold.
a Swedish population norms, b compared with the female patients, c compared with the male patients. d Danish population norms, e compared with the female patients, f compared with the male patients.
SF-36, 36-item Short-Form Health Status; ThyPRO, Thyroid-Related Patient-Reported Outcome.

Table 4. Total Number of Patients in the Treatment Groups and the Extent of Levothyroxine Treatment at Follow-Up 6-10 Years After Diagnosis

Table 5. Self-Reported Feeling of Recovery in 1186 Patients 6-10 Years After First Episode of Graves' Disease